Continuous-time Markov decision processes with nth-bias optimality criteria

نویسندگان

  • Junyu Zhang
  • Xi-Ren Cao
چکیده

In this paper, we study the nth-bias optimality problem for finite continuous-time Markov decision processes (MDPs) with a multichain structure. We first provide nth-bias difference formulas for two policies and present some interesting characterizations of an nth-bias optimal policy by using these difference formulas. Then, we prove the existence of an nth-bias optimal policy by using nth-bias optimal policy iteration algorithms, and show that such an nth-bias optimal policy can be obtained in a finite number of policy iterations. © 2009 Elsevier Ltd. All rights reserved.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Risk-Sensitive and Mean Variance Optimality in Markov Decision Processes

In this note, we compare two approaches for handling risk-variability features arising in discrete-time Markov decision processes: models with exponential utility functions and mean variance optimality models. Computational approaches for finding optimal decision with respect to the optimality criteria mentioned above are presented and analytical results showing connections between the above op...

متن کامل

The n th-Order Bias Optimality for Multichain Markov Decision Processes

The paper proposes a new approach to the theory of Markov decision processes (MDPs) with average performance criteria and finite state and action spaces. Using the average performance and bias difference formulas derived in this paper, we develop an optimization theory for average performance (or gain) optimality, bias optimality, and all the high-order bias optimality, in a unified way. The ap...

متن کامل

A Probabilistic Analysis of Bias Optimality in Unichain Markov Decision Processes y

Since the long-run average reward optimality criterion is underselective, a decisionmaker often uses bias to distinguish between multiple average optimal policies. We study bias optimality in unichain, nite state and action space Markov Decision Processes. A probabilistic approach is used to give intuition as to why a bias-based decision-maker prefers a particular policy over another. Using rel...

متن کامل

Discrete-time Markov control processes with discounted unbounded costs: Optimality criteria

We consider discrete-time Markov control processes with Borel state and control spaces, unbounded costs per stage and not necessarily compact control constraint sets. The basic control problem we are concerned with is to minimize the infinite-horizon, expected total discounted cost. Under easily verifiable assumptions, we provide characterizations of the optimal cost function and optimal polici...

متن کامل

On $L_1$-weak ergodicity of nonhomogeneous continuous-time Markov‎ ‎processes

‎In the present paper we investigate the $L_1$-weak ergodicity of‎ ‎nonhomogeneous continuous-time Markov processes with general state‎ ‎spaces‎. ‎We provide a necessary and sufficient condition for such‎ ‎processes to satisfy the $L_1$-weak ergodicity‎. ‎Moreover‎, ‎we apply‎ ‎the obtained results to establish $L_1$-weak ergodicity of quadratic‎ ‎stochastic processes‎.

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Automatica

دوره 45  شماره 

صفحات  -

تاریخ انتشار 2009